Baseline covariates impact the outcome in many clinical trials. Although baseline adjustment is not always necessary, in case of a strong or moderate association between a baseline covariate(s) and the primary outcome measure, adjustment for such covariate(s) generally improves the efficiency of the analysis and avoids conditional bias from chance covariate imbalance.
Part 1: Continuous outcomes
If a baseline value of a continuous primary outcome measure is available, then this should usually be included as a covariate.
First, we will go over the overview slides on Supplemental Materials on course web page: Analyzing change from baseline in trials. The mean difference presented in the slides can be expressed using the following three linear regression model.
Models
For a continuous outcome \(Y\), also measured at baseline \(W\) and treatment group \(X\), consider the following linear regression models. Let \(Z = Y - W\), the change from baseline.
Model 1: Compare the mean final value by treatment group
\[
Y_i = \gamma_0 + \gamma_1*X_i + \epsilon_i
\]
Model 2: Compare the mean change, final minus initial, by treatment group.
\[
Z_i = \alpha_0 + \alpha_1 * X_i + e_i
\]
Model 3: Final value adjusted for baseline by treatment group
Q 1.1 When is model 1 preferred or model 2? That is, what values of \(\rho\) (correlation between baseline and followup values of outcome) lead to smaller variance under each model?
If we assume equal variance for the baseline and final values
The following table gives the calculated SD from the formulas for Models 1, 2 and 3 for values of rho ranging from -0.2 to 0.9. Models 1 is more efficient than model 2 for low correlations below 0.5; Model 2 is more efficient than model 1 for high correlations above 0.5; Model 1 and Model 2 are equivalent at rho of 0.5. Model 3, adjustment for baseline, is equivalent to Model 1 when rho is 0 (baseline and followup are not correlated). Otherwise, model 3 is always better than models 1 or 2. For very high correlations, Models 2 and 3 have similar efficiency. Model 3 improves efficiency if rho is positive or negative.
rho <-seq(-.2,.9,by=.1)sigma <-1# Model 1sd1 <-rep(sigma, length(rho))# Model 2sd2 <-sqrt(2*sigma^2*(1-rho))# Model 3sd3 <-sqrt(sigma^2*(1-rho^2))cbind(rho, sd1, sd2, sd3)
Assuming a correlation of 0.4 between baseline and follow-up pain scores, a clinically important difference of 1.8 NRS units, a standard deviation of 3 NRS units, power of 80% and significance level of 5%, the following sample sizes are required:
R code and results
Using the formulas above, we can find the standard deviation under the three analysis approaches (Final, Change, Regression adjusted for baseline)
Effect size is then difference in means (1.8) divided by the standard deviations
In R, input this effect size divided by the standard deviation (d) and either power (to output sample size) or sample size (to output power)
library(pwr)# For final values1 <-3# For changes2 <-sqrt(2*3^2*(1-0.4))s2
[1] 3.286335
# For linear regression (Stata calls this ANCOVA)s3 <-sqrt(3^2*(1-.4^2))s3
[1] 2.749545
# Final valuepwr.t.test(n=NULL, d=1.8/s1, power=.9)
Two-sample t test power calculation
n = 59.35155
d = 0.6
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: n is number in *each* group
# Changepwr.t.test(n=NULL, d=1.8/s2, power=.9)
Two-sample t test power calculation
n = 71.02369
d = 0.5477226
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: n is number in *each* group
# Regression adjusting for baselinepwr.t.test(n=NULL, d=1.8/s3, power=.9)
Two-sample t test power calculation
n = 50.01479
d = 0.6546537
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: n is number in *each* group
Stata approach
Stata can give these same results, but doesn’t require you to know the standard deviation formulas given above for Models 1, 2 and 3
Following is the full Stata output for the example.
(You would have to know that ANCOVA is equivalent to the linear regression model adjusted for baseline)
. sampsi 0 1.8, sd(3) pre(1) post(1) r01(.4)
Estimated sample size for two samples with repeated measures
Assumptions:
alpha = 0.0500 (two-sided)
power = 0.9000
m1 = 0
m2 = 1.8
sd1 = 3
sd2 = 3
n2/n1 = 1.00
number of follow-up measurements = 1
number of baseline measurements = 1
correlation between baseline & follow-up = 0.400
Method: POST
relative efficiency = 1.000
adjustment to sd = 1.000
adjusted sd1 = 3.000
Estimated required sample sizes:
n1 = 59
n2 = 59
Method: CHANGE
relative efficiency = 0.833
adjustment to sd = 1.095
adjusted sd1 = 3.286
Estimated required sample sizes:
n1 = 71
n2 = 71
Method: ANCOVA
relative efficiency = 1.190
adjustment to sd = 0.917
adjusted sd1 = 2.750
Estimated required sample sizes:
n1 = 50
n2 = 50
. di sqrt(1-0.4^2)
.91651514
. di 3*sqrt(1-.4^2)
2.7495454
. di sqrt(2*1*(1-0.4))
1.0954451
Part 2: Binary outcomes
For generalized linear models or non-linear models, adjusted and unadjusted treatment effects may not have the same interpretation and, sometimes, different results may be obtained from adjusted and unadjusted analyses. Thus, the choice of the appropriate covariates and the pre-specification of the primary model are critically important.